
    A stroll along the gamma

    We provide the first in-depth study of the "smart path" interpolation between an arbitrary probability measure and the gamma-$(\alpha, \lambda)$ distribution. We propose new explicit representation formulae for the ensuing process as well as a new notion of relative Fisher information with a gamma target distribution. We use these results to prove a differential and an integrated De Bruijn identity which hold under minimal conditions, thereby extending the classical formulae which follow from Bakry, Emery and Ledoux's $\Gamma$-calculus. Exploiting a specific representation of the "smart path", we obtain a new proof of the logarithmic Sobolev inequality for the gamma law with $\alpha \geq 1/2$ as well as a new type of HSI inequality linking relative entropy, Stein discrepancy and standardized Fisher information for the gamma law with $\alpha \geq 1/2$.

    Stein-type covariance identities: Klaassen, Papathanasiou and Olkin-Shepp type bounds for arbitrary target distributions

    In this paper, we present a minimal formalism for Stein operators which leads to different probabilistic representations of solutions to Stein equations. These in turn provide a wide family of Stein covariance identities, which we use to revisit the classical topic of bounding the variance of functionals of random variables. Applying the Cauchy-Schwarz inequality yields first-order upper and lower Klaassen-type variance bounds. A probabilistic representation of Lagrange's identity (i.e. Cauchy-Schwarz with remainder) leads to Papathanasiou-type variance expansions of arbitrary order. A matrix Cauchy-Schwarz inequality leads to Olkin-Shepp type covariance bounds. All results hold for univariate target distributions under very weak assumptions (in particular they hold for continuous and discrete distributions alike). Many concrete illustrations are provided.

    On the rate of convergence in de Finetti's representation theorem

    A consequence of de Finetti's representation theorem is that for every infinite sequence of exchangeable 0-1 random variables $(X_k)_{k\geq 1}$, there exists a probability measure $\mu$ on the Borel sets of $[0,1]$ such that $\bar X_n = n^{-1} \sum_{i=1}^n X_i$ converges weakly to $\mu$. For a wide class of probability measures $\mu$ having smooth density on $(0,1)$, we give bounds of order $1/n$ with explicit constants for the Wasserstein distance between the law of $\bar X_n$ and $\mu$. This extends a recent result by Goldstein and Reinert \cite{goldstein2013stein} regarding the distance between the scaled number of white balls drawn in a Pólya-Eggenberger urn and its limiting distribution. We also prove that, in the most general cases, the distance between the law of $\bar X_n$ and $\mu$ is bounded below by $1/n$ and above by $1/\sqrt{n}$ (up to some multiplicative constants). For every $\delta \in [1/2,1]$, we give an example of an exchangeable sequence such that this distance is of order $1/n^\delta$.
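
    To make the $1/n$ rate concrete, here is a small worked example under an assumption that is not taken from the paper: $\mu$ is the uniform law on $[0,1]$. In that case $n \bar X_n$ is uniform on $\{0, 1, \dots, n\}$, so the distribution function of $\bar X_n$ equals $(k+1)/(n+1)$ on $[k/n, (k+1)/n)$, and the Wasserstein-1 distance is the integral of the absolute difference between the two distribution functions. The function name `wasserstein_uniform_case` is hypothetical.

```python
from fractions import Fraction


def wasserstein_uniform_case(n: int) -> Fraction:
    """Exact W1 distance between the law of the sample mean and Uniform(0, 1).

    Assumption (illustrative only): the de Finetti mixing measure is uniform,
    so n * mean is uniform on {0, ..., n}.
    """
    total = Fraction(0)
    for k in range(n):
        left, right = Fraction(k, n), Fraction(k + 1, n)
        # Value of the discrete CDF on [k/n, (k+1)/n); it lies in [left, right].
        level = Fraction(k + 1, n + 1)
        # Integral of |level - x| over [left, right], split at x = level.
        total += ((level - left) ** 2 + (right - level) ** 2) / 2
    return total


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        w1 = wasserstein_uniform_case(n)
        print(n, float(w1), float(n * w1))
```

    In this special case the sum simplifies to $(2n+1)/(6n(n+1))$, so $n$ times the distance tends to $1/3$, which is consistent with a rate of order $1/n$.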

    Distances between nested densities and a measure of the impact of the prior in Bayesian statistics

    In this paper we propose tight upper and lower bounds for the Wasserstein distance between any two univariate continuous distributions with probability densities $p_1$ and $p_2$ having nested supports. These explicit bounds are expressed in terms of the derivative of the likelihood ratio $p_1/p_2$ as well as the Stein kernel $\tau_1$ of $p_1$. The method of proof relies on a new variant of Stein's method which manipulates Stein operators. We give several applications of these bounds. Our main application is in Bayesian statistics: we derive explicit data-driven bounds on the Wasserstein distance between the posterior distribution based on a given prior and the no-prior posterior based uniquely on the sampling distribution. This is the first finite sample result confirming the well-known fact that with well-identified parameters and large sample sizes, reasonable choices of prior distributions will have only minor effects on posterior inferences if the data are benign.

    The Adaptive Sampling Revisited

    The problem of estimating the number $n$ of distinct keys of a large collection of $N$ data is well known in computer science. A classical algorithm is adaptive sampling (AS). $n$ can be estimated by $R \cdot 2^D$, where $R$ is the final bucket (cache) size and $D$ is the final depth at the end of the process. Several new interesting questions can be asked about AS (some of them were suggested by P. Flajolet and popularized by J. Lumbroso). The distribution of $W=\log (R2^D/n)$ is known; we rederive this distribution in a simpler way. We provide new results on the moments of $D$ and $W$. We also analyze the distribution of the final cache size $R$. We consider colored keys: assume that among the $n$ distinct keys, $n_C$ have color $C$. We show how to estimate $p=\frac{n_C}{n}$. We also study colored keys with some multiplicity given by some distribution function. We want to estimate the mean and variance of this distribution. Finally, we consider the case where neither colors nor multiplicities are known. There we want to estimate the related parameters. An appendix is devoted to the case where the hashing function provides bits with probability different from $1/2$.
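
    The estimator $R \cdot 2^D$ can be illustrated with a minimal sketch of the textbook adaptive sampling scheme. This is not the paper's own code: the hash function (`hashlib.blake2b` truncated to 64 bits), the helper names `hash64` and `adaptive_sampling`, and the default `capacity` are all assumptions made for the illustration.

```python
import hashlib


def hash64(key: str) -> int:
    """Map a key to a 64-bit integer (assumed to behave like uniform random bits)."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def adaptive_sampling(keys, capacity=64):
    """Return (R, D, estimate) with estimate = R * 2**D.

    A hashed key is kept only if its lowest D bits are all zero. When the
    cache overflows, D grows by one and the cache is filtered again.
    """
    depth = 0
    cache = set()
    for key in keys:
        h = hash64(key)
        if h & ((1 << depth) - 1):
            continue  # key is outside the current sample
        cache.add(h)  # duplicates collapse in the set
        while len(cache) > capacity:
            depth += 1
            mask = (1 << depth) - 1
            cache = {x for x in cache if (x & mask) == 0}
    return len(cache), depth, len(cache) * 2 ** depth


if __name__ == "__main__":
    # 20000 items but only 5000 distinct keys.
    data = [f"key-{i % 5000}" for i in range(20000)]
    print(adaptive_sampling(data, capacity=256))
```

    While the number of distinct keys stays below `capacity`, the depth remains 0 and the count is exact; beyond that, the estimate is random and its accuracy improves as `capacity` grows.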

    On Hodges and Lehmann's "$6/\pi$ result"

    While the asymptotic relative efficiency (ARE) of Wilcoxon rank-based tests for location and regression with respect to their parametric Student competitors can be arbitrarily large, Hodges and Lehmann (1961) have shown that the ARE of the same Wilcoxon tests with respect to their van der Waerden or normal-score counterparts is bounded from above by $6/\pi \approx 1.910$. In this paper, we revisit that result, and investigate similar bounds for statistics based on Student scores. We also consider the serial version of this ARE. More precisely, we study the ARE, under various densities, of the Spearman-Wald-Wolfowitz and Kendall rank-based autocorrelations with respect to the van der Waerden or normal-score ones used to test (ARMA) serial dependence alternatives.